N-gram Parsing for Jointly Training a Discriminative Constituency Parser
Abstract
Syntactic parsers are designed to detect the complete syntactic structure of grammatically correct sentences. In this paper, we introduce the concept of n-gram parsing, which corresponds to generating the constituency parse tree of n consecutive words in a sentence. We create a stand-alone n-gram parser derived from a baseline full discriminative constituency parser and analyze the characteristics of the generated n-gram trees for various values of n. Since the produced n-gram trees are in general smaller and less complex than full parse trees, it is likely that n-gram parsers are more robust than full parsers. Therefore, we use n-gram parsing to boost the accuracy of a full discriminative constituency parser in a hierarchical joint learning setup. Our results show that the full parser jointly trained with an n-gram parser performs statistically significantly better than our baseline full parser on the English Penn Treebank test corpus.
Similar resources
Self-training a Constituency Parser using n-gram Trees
In this study, we tackle the problem of self-training a feature-rich discriminative constituency parser. We approach the self-training problem with the assumption that while the full sentence parse tree produced by a parser may contain errors, some portions of it are more likely to be correct. We hypothesize that instead of feeding the parser the guessed full sentence parse trees of its own, we...
Tree Kernels-based Discriminative Reranker for Italian Constituency Parsers
English. This paper aims at filling the gap between the accuracy of Italian and English constituency parsing: firstly, we adapt the Bllip parser, i.e., the most accurate constituency parser for English, also known as Charniak parser, for Italian and train it on the Turin University Treebank (TUT). Secondly, we design a parse reranker based on Support Vector Machines using tree kernels, where ...
A Discriminative Model for Joint Morphological Disambiguation and Dependency Parsing
Most previous studies of morphological disambiguation and dependency parsing have been pursued independently. Morphological taggers operate on n-grams and do not take into account syntactic relations; parsers use the “pipeline” approach, assuming that morphological information has been separately obtained. However, in morphologically-rich languages, there is often considerable interaction betwe...
Word Segmentation, Unknown-word Resolution, and Morphological Agreement in a Hebrew Parsing System
We present a constituency parsing system for Modern Hebrew. The system is based on the PCFG-LA parsing method of Petrov et al. (2006), which is extended in various ways in order to accommodate the specificities of Hebrew as a morphologically rich language with a small treebank. We show that parsing performance can be enhanced by utilizing a language resource external to the treebank, specifical...
Journal title:
- Polibits
Volume 47, Issue
Pages -
Publication date 2013